Last Update: 7/13/2025
Gemini Chat Completion API (OpenAI-compatible API)
The Gemini Chat Completion API lets you generate conversational responses using Gemini's language models. This document describes the API endpoint, the request headers and parameters, and the response structure.
Endpoint
POST https://platform.llmprovider.ai/v1/chat/completions
Request Headers
| Header | Value | 
|---|---|
| Authorization | Bearer YOUR_API_KEY | 
| Content-Type | application/json | 
Request Body
The request body should be a JSON object with the following parameters:
| Parameter | Type | Description | 
|---|---|---|
| model | string | The model to use (e.g., gemini-1.5-flash). | 
| messages | array | A list of message objects representing the conversation history. | 
| frequency_penalty | number | (Optional) Penalty for new tokens based on their frequency in the text so far. Range: -2 <= x <= 2. |
| max_tokens | integer | (Optional) The maximum number of tokens to generate. |
| presence_penalty | number | (Optional) Penalty for new tokens based on their presence in the text so far. Range: -2 <= x <= 2. |
| response_format | object | (Optional) An object specifying the format that the model must output. | 
| stop | array | (Optional) Up to 4 sequences where the API will stop generating further tokens. | 
| stream | boolean | (Optional) Whether to stream the response as it is generated. | 
| temperature | number | (Optional) The sampling temperature to use. Range: 0 <= x <= 2. |
| top_p | number | (Optional) Top-p sampling probability. Range: 0 <= x <= 1. |
| tools | array | (Optional) A list of tools the model may call. | 
| tool_choice | string or object | (Optional) Controls which (if any) tool is called by the model (e.g., "none"). |
| logprobs | boolean | (Optional) Whether to return log probabilities of the output tokens. |
| top_logprobs | integer | (Optional) The number of most likely tokens to return at each token position, each with an associated log probability. Range: 0 <= x <= 20. |
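The ranges in the table above can be checked on the client before a request is sent. The following is a minimal sketch, not part of the API: `validate_params` is a hypothetical helper, and it assumes that `model` and `messages` are required because they are the only parameters not marked Optional.

```python
# Ranges copied from the parameter table; the helper itself is hypothetical.
RANGES = {
    "frequency_penalty": (-2, 2),
    "presence_penalty": (-2, 2),
    "temperature": (0, 2),
    "top_p": (0, 1),
    "top_logprobs": (0, 20),
}
MAX_STOP_SEQUENCES = 4


def validate_params(body):
    """Return a list of problems found in a request body (empty if none)."""
    problems = []
    # Assumption: parameters not marked Optional in the table are required.
    for required in ("model", "messages"):
        if not body.get(required):
            problems.append(f"{required} is required")
    # Check each numeric parameter against its documented range.
    for name, (low, high) in RANGES.items():
        value = body.get(name)
        if value is not None and not (low <= value <= high):
            problems.append(f"{name} must satisfy {low} <= x <= {high}, got {value}")
    # The table allows up to 4 stop sequences.
    stop = body.get("stop")
    if isinstance(stop, list) and len(stop) > MAX_STOP_SEQUENCES:
        problems.append(
            f"stop accepts at most {MAX_STOP_SEQUENCES} sequences, got {len(stop)}"
        )
    return problems
```

Calling `validate_params(body)` before posting lets a client report every out-of-range value at once instead of discovering them one failed request at a time.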
Example Request
{
  "messages": [
    {
      "content": "You are a helpful assistant",
      "role": "system"
    },
    {
      "content": "Hi",
      "role": "user"
    }
  ],
  "model": "gemini-1.5-flash",
  "frequency_penalty": 0,
  "max_tokens": 2048,
  "presence_penalty": 0,
  "response_format": {
    "type": "text"
  },
  "stop": null,
  "stream": false,
  "stream_options": null,
  "temperature": 1,
  "top_p": 1,
  "tools": null,
  "tool_choice": "none",
  "logprobs": false,
  "top_logprobs": null
}
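A request body like the one above can also be assembled in code. The sketch below is illustrative only: `build_request_body` is a hypothetical helper, and it omits unset options instead of sending them as null, which assumes the API treats an omitted parameter the same as a null one.

```python
def build_request_body(model, history, user_message, system_prompt=None, **options):
    """Assemble a chat completion request body as a plain dict."""
    messages = []
    if system_prompt:
        messages.append({"role": "system", "content": system_prompt})
    messages.extend(history)  # earlier turns, oldest first
    messages.append({"role": "user", "content": user_message})

    body = {"model": model, "messages": messages}
    # Copy only the optional parameters the caller actually set.
    body.update({k: v for k, v in options.items() if v is not None})
    return body


body = build_request_body(
    "gemini-1.5-flash",
    history=[],
    user_message="Hi",
    system_prompt="You are a helpful assistant",
    temperature=1,
    max_tokens=2048,
)
```

For a multi-turn conversation, pass the earlier user and assistant messages as `history` so the model receives the full exchange.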
Response Body
The response body will be a JSON object containing the generated completions and other metadata.
| Field | Type | Description | 
|---|---|---|
| id | string | Unique identifier for the completion. | 
| object | string | The type of object returned, usually chat.completion. | 
| created | integer | Unix timestamp (in seconds) of when the completion was created. |
| model | string | The model used for the completion. | 
| choices | array | A list of generated completion choices. | 
| usage | object | Token usage statistics for the request. | 
| system_fingerprint | string | This fingerprint represents the backend configuration that the model runs with. | 
Example Response
{
  "id": "0aa5ca3a-2fc0-42c3-a247-fe5897a45d46",
  "object": "chat.completion",
  "created": 1738923553,
  "model": "gemini-1.5-flash",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "Hello! How can I assist you today? 😊"
      },
      "logprobs": null,
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 9,
    "completion_tokens": 11,
    "total_tokens": 20,
    "prompt_tokens_details": {
      "cached_tokens": 0
    },
    "prompt_cache_hit_tokens": 0,
    "prompt_cache_miss_tokens": 9
  },
  "system_fingerprint": "fp_3a5770e1b4"
}
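Most callers need only the generated text and the token counts from a response like the one above. The following sketch shows one way to pull them out of the parsed JSON; `summarize_completion` is a hypothetical helper name, and it reads only the first entry in `choices`.

```python
def summarize_completion(response):
    """Extract the reply text and token usage from a parsed response dict."""
    choice = response["choices"][0]  # first (here, only) completion choice
    usage = response.get("usage", {})
    return {
        "text": choice["message"]["content"],
        "finish_reason": choice.get("finish_reason"),
        "prompt_tokens": usage.get("prompt_tokens"),
        "completion_tokens": usage.get("completion_tokens"),
        "total_tokens": usage.get("total_tokens"),
    }
```

With the `requests` library, the input would be `response.json()` from a successful call.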
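Example Requests
The same minimal request is shown below in three forms, in this order: Shell (curl), Node.js (axios), and Python (requests).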
curl -X POST https://platform.llmprovider.ai/v1/chat/completions \
-H "Authorization: Bearer $YOUR_API_KEY" \
-H "Content-Type: application/json" \
-d '{
"model": "gemini-1.5-flash",
"messages": [
  {
    "role": "user",
    "content": "Hello!"
  }
]
}'
const axios = require('axios');
const apiKey = 'YOUR_API_KEY';
const url = 'https://platform.llmprovider.ai/v1/chat/completions';
const data = {
    model: 'gemini-1.5-flash',
    messages: [
        {
            role: 'user',
            content: 'Hello!'
        }
    ]
};
const headers = {
    'Authorization': `Bearer ${apiKey}`,
    'Content-Type': 'application/json'
};
axios.post(url, data, { headers })
.then(response => {
    console.log('Response:', response.data);
})
.catch(error => {
    console.error('Error:', error);
});
import requests
import json
api_key = 'YOUR_API_KEY'
url = 'https://platform.llmprovider.ai/v1/chat/completions'
headers = {
    'Authorization': f'Bearer {api_key}',
    'Content-Type': 'application/json'
}
data = {
    'model': 'gemini-1.5-flash',
    'messages': [
        {
            'role': 'user',
            'content': 'Hello!'
        }
    ]
}
response = requests.post(url, headers=headers, data=json.dumps(data))
if response.status_code == 200:
    print('Response:', response.json())
else:
    print('Error:', response.status_code, response.text)
For more details, refer to the Gemini API documentation.